---
title: 'INFO 8000 Assignment #3/ Filtering through Turbulence Reports from January'
output:
  html_document: default
  pdf_document: default
---

Install Packages (the install.packages() calls must be commented out when publishing through knit)

#install.packages(c("ggplot2", "devtools", "dplyr", "stringr", "tidyverse"))
#install.packages(c("maps", "mapdata"))
library(stringr)
library(devtools)
library(ggplot2)
library(ggmap)
library(maps)
library(mapdata)
library(plyr)
## 
## Attaching package: 'plyr'
## The following object is masked from 'package:maps':
## 
##     ozone
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:plyr':
## 
##     arrange, count, desc, failwith, id, mutate, rename, summarise,
##     summarize
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(readr)
library(tidyverse)
## Loading tidyverse: tibble
## Loading tidyverse: tidyr
## Loading tidyverse: purrr
## Conflicts with tidy packages ----------------------------------------------
## arrange():   dplyr, plyr
## compact():   purrr, plyr
## count():     dplyr, plyr
## failwith():  dplyr, plyr
## filter():    dplyr, stats
## id():        dplyr, plyr
## lag():       dplyr, stats
## map():       purrr, maps
## mutate():    dplyr, plyr
## rename():    dplyr, plyr
## summarise(): dplyr, plyr
## summarize(): dplyr, plyr

Read in CSV File of Pilot Reports

PIREPS <- read_csv("C:/Users/Nick Morgan/Desktop/stormattr_201701010000_201705010000 (1).csv")
## Parsed with column specification:
## cols(
##   VALID = col_double(),
##   URGENT = col_logical(),
##   AIRCRAFT = col_character(),
##   REPORT = col_character(),
##   LAT = col_character(),
##   LON = col_character()
## )
## Warning in rbind(names(probs), probs_f): number of columns of result is not
## a multiple of vector length (arg 1)
## Warning: 3904 parsing failures.
## # A tibble: 5 x 5
##     row   col  expected    actual
##   <int> <chr>     <chr>     <chr>
## 1    77  <NA> 6 columns 7 columns
## 2   129  <NA> 6 columns 7 columns
## 3   131  <NA> 6 columns 7 columns
## 4   148  <NA> 6 columns 7 columns
## 5   158  <NA> 6 columns 7 columns
## # ... with 1 more variables: file <chr>
## See problems(...) for more details.
cols(
  VALID = col_double(),
  URGENT = col_logical(),
  AIRCRAFT = col_character(),
  REPORT = col_character(),
  LAT = col_character(),
  LON = col_character()
)
## cols(
##   VALID = col_double(),
##   URGENT = col_logical(),
##   AIRCRAFT = col_character(),
##   REPORT = col_character(),
##   LAT = col_character(),
##   LON = col_character()
## )
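
The 3904 parsing failures above are rows that carry 7 fields where 6 were expected. A minimal sketch of how such rows could be located before reading the file, using base R's count.fields() on a small made-up sample (the sample lines and the names sample_lines, n_fields, and bad_rows are hypothetical, not taken from the real file):

```r
# Hypothetical sample in the same six-column layout as the PIREP file.
# The last line has a stray comma inside the report text.
sample_lines <- c(
  "VALID,URGENT,AIRCRAFT,REPORT,LAT,LON",
  "201701010000,FALSE,B737,UA /OV ATL/TB LGT,33.6,-84.4",
  "201701010005,FALSE,B757,UA /OV ORD, 10 N/TB MOD,41.9,-87.9"
)

# Count comma-separated fields on every line (no quoting, no comments)
con <- textConnection(sample_lines)
n_fields <- count.fields(con, sep = ",", quote = "", comment.char = "")
close(con)

# Data rows (header excluded) whose field count differs from the header's
bad_rows <- which(n_fields != n_fields[1]) - 1
print(bad_rows)
```

For the real file, the same call could be pointed at the CSV path instead of a text connection. Whether the extra field always comes from a comma inside REPORT is an assumption that would need to be checked against the flagged rows.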

Questions:

1.) What type of commercial aircraft has the most Pilot Reports?

(Create a variable that contains only the aircraft column, then filter the data to find the aircraft type that has the most reports.)

# Keep only the aircraft column and the five Boeing types of interest
AircraftOnly <- PIREPS %>%
  select(AIRCRAFT) %>%
  filter(AIRCRAFT %in% c("B717", "B737", "B747", "B757", "B767"))
# Count the reports for each aircraft type
DeltaPlanes <- table(AircraftOnly$AIRCRAFT)
DeltaPlanes2 <- data.frame(DeltaPlanes)
# Take the bar labels from the table itself so they always match the counts
barplot(DeltaPlanes2$Freq, main = "Reports by Type of Aircraft", xlab = "Aircraft Type", ylab = "Number of Reports", ylim = c(0,40000), names.arg = names(DeltaPlanes))

This information is useful because the size of the aircraft has an effect on the turbulence that the pilot will feel. The smaller the plane, the lower its threshold for Extreme turbulence compared with a larger aircraft.
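
To answer question 1 with a number rather than by reading the bar chart, the counts can be sorted and the top entry picked out. This is only a sketch on a made-up vector of aircraft codes (aircraft, boeing, counts, and top_type are hypothetical names). With the real data the AIRCRAFT column would take the place of the toy vector.

```r
# Hypothetical aircraft codes standing in for the AIRCRAFT column
aircraft <- c("B737", "B737", "B757", "B717", "B737", "B767", "C172")

# Keep only the five Boeing types compared in the bar chart
boeing <- aircraft[aircraft %in% c("B717", "B737", "B747", "B757", "B767")]

# Count reports per type and sort from most to fewest
counts <- sort(table(boeing), decreasing = TRUE)

# The first name is the type with the most reports
top_type <- names(counts)[1]
print(counts)
print(top_type)
```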

2.) Where are the majority of these pilot reports coming from?

(Read in a filtered CSV (corrupt lat/longs deleted), download a map of the United States, plot the latitude and longitude of each report, and create a density map.)

PIREPS2<-read_csv("C:/Users/Nick Morgan/Desktop/Filteredstormattr_201701010000_201705010000.csv")
## Warning: Missing column names filled in: 'X5' [5], 'X6' [6], 'X7' [7],
## 'X8' [8]
## Parsed with column specification:
## cols(
##   AIRCRAFT = col_character(),
##   REPORT = col_character(),
##   LAT = col_double(),
##   LON = col_double(),
##   X5 = col_character(),
##   X6 = col_character(),
##   X7 = col_character(),
##   X8 = col_character()
## )
LatLong <- data.frame(PIREPS2) %>%
  select(LAT, LON) %>%
  filter(LAT >= 0 & LON <= 0)   # numeric comparison: northern and western hemispheres only
USA<-get_map('usa', zoom=4)
## Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=usa&zoom=4&size=640x640&scale=2&maptype=terrain&language=en-EN&sensor=false
## Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=usa&sensor=false
ggmap(USA)+
  stat_density2d(
    aes(x = LON, y = LAT, fill = ..level.., alpha = ..level..),
    data = LatLong,
    geom = "polygon",
    bins = 20,
    show.legend = FALSE)+                                        #hide this layer's legend
  scale_fill_gradient(low = "black", high = "red") +             #change display colors
  theme(legend.position="none")+                                 #no legend
  ggtitle("Density of Pilot Reports")+                           #title name
  theme(plot.title = element_text(lineheight=3.5, face="bold"))  #title graphics
## Warning: Removed 24657 rows containing non-finite values (stat_density2d).

As a turbulence researcher, I find this information useful because it shows me the areas that aren't receiving many reports. For those areas, I know to rely on satellite and radar data for turbulence diagnostics.
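
The "Removed 24657 rows containing non-finite values" warning means many coordinates never reach the density layer. A small sketch of how the coordinates could be converted and screened explicitly, so the number of dropped rows is known up front. The data frame raw and its values are made up for illustration, and the hemisphere limits simply mirror the filter used above.

```r
# Hypothetical raw coordinates as text, including a corrupt entry and an NA
raw <- data.frame(
  LAT = c("33.6", "41.9", "KATL", NA),
  LON = c("-84.4", "-87.9", "-84.4", "-90.1"),
  stringsAsFactors = FALSE
)

# Convert to numbers; anything that is not a number becomes NA
coords <- data.frame(
  LAT = suppressWarnings(as.numeric(raw$LAT)),
  LON = suppressWarnings(as.numeric(raw$LON))
)

# Keep rows with finite coordinates in the northern and western hemispheres
keep <- is.finite(coords$LAT) & is.finite(coords$LON) &
  coords$LAT >= 0 & coords$LON <= 0
clean <- coords[keep, ]

# Report how many rows were dropped
n_dropped <- sum(!keep)
print(clean)
print(n_dropped)
```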

3.) Of those reports, which reports had turbulence reported as well?

(Use the stringr package to filter the “Report” column in the PIREP database for reports containing “TB”, which indicates turbulence on the flight as reported by the pilot. Then use the USA map downloaded earlier and add the lat/long of the reports indicating turbulence.)

library(stringr)
# Keep only the reports whose text contains "TB"
testforturb <- data.frame(PIREPS, stringsAsFactors = FALSE) %>%
  filter(str_detect(REPORT, "TB"))
testforturb$LON <- as.numeric(as.character(testforturb$LON))
## Warning: NAs introduced by coercion
testforturb$LAT <- as.numeric(as.character(testforturb$LAT))
## Warning: NAs introduced by coercion
ggmap(USA)+ggtitle("Pilot Reports Indicating Turbulence")+
  geom_point(data = testforturb, aes(x = LON, y = LAT), color= "red", alpha = 0.1, size = 1)
## Warning: Removed 14768 rows containing missing values (geom_point).

This is one of the more important graphics I can make: it displays every pilot report that indicates turbulence, at any intensity. With these reports, I can investigate their time frames to find turbulence outbreaks that might not have been detected previously.
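
One caveat with searching for the bare letters "TB" is that they can also turn up inside other parts of a report, such as a location identifier. In the standard PIREP layout the turbulence field is introduced by "/TB", so matching on the slash as well is a stricter test. The sketch below compares the two on made-up reports. The report strings, including the identifier DTB, are hypothetical, and base R's grepl() is used here in place of str_detect() so the example runs without extra packages.

```r
# Hypothetical report strings; the third has "TB" only inside an identifier
reports <- c(
  "UA /OV ATL/TM 1200/FL350/TP B737/TB MOD",
  "UA /OV ORD/TM 1210/FL030/TP B757/SK OVC020",
  "UA /OV DTB/TM 1215/FL100/TP B717/WX RA"
)

# Loose match: any occurrence of the letters "TB"
loose <- grepl("TB", reports, fixed = TRUE)

# Stricter match: the "/TB" field marker
strict <- grepl("/TB", reports, fixed = TRUE)

print(data.frame(loose, strict))
```

How often the loose match over-counts in the real dataset is not something this sketch can say. Comparing sum(loose) with sum(strict) on the REPORT column would show it.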

4.) How many of these reports were of “Moderate” Intensity Turbulence?

(Similar to 3, except looking for reports of Moderate turbulence, again using the stringr package.)

# Keep only the reports whose text contains "MOD"
MODTurb <- data.frame(PIREPS, stringsAsFactors = FALSE) %>%
  filter(str_detect(REPORT, "MOD"))
MODTurb$LON <- as.numeric(as.character(MODTurb$LON))
## Warning: NAs introduced by coercion
MODTurb$LAT <- as.numeric(as.character(MODTurb$LAT))
## Warning: NAs introduced by coercion
ggmap(USA)+ggtitle("Pilot Reports Indicating MODERATE Turbulence")+
  geom_point(data = MODTurb, aes(x = LON, y = LAT), color= "black", alpha = 0.1, size = 1)
## Warning: Removed 5739 rows containing missing values (geom_point).

This code allows me to filter the dataset by the intensity of the reported turbulence. The search string can be changed to any intensity that I would like to investigate, which in turn gives me the locations of those reports.
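
Since only the search string changes between intensities, the filter can be wrapped in a small function. This is a sketch under assumptions: the function name filter_by_intensity is made up, the reports are hypothetical, and the intensity codes (LGT, MOD, SEV) are assumed to appear as separate words after the "/TB" marker. Base R's grepl() stands in for str_detect().

```r
# Return the reports whose turbulence field mentions the given intensity code.
# The pattern requires "/TB", then any text without another "/", then the code
# as a whole word, so a "MOD" elsewhere in the report does not count.
filter_by_intensity <- function(reports, intensity) {
  pattern <- paste0("/TB[^/]*\\b", intensity, "\\b")
  reports[grepl(pattern, reports, perl = TRUE)]
}

# Hypothetical reports; the third has "MOD" only as a location identifier
reports <- c(
  "UA /OV ATL/FL350/TP B737/TB MOD",
  "UA /OV ORD/FL030/TP B757/TB LGT",
  "UA /OV MOD/FL100/TP B717/SK OVC020"
)

moderate <- filter_by_intensity(reports, "MOD")
light <- filter_by_intensity(reports, "LGT")
print(moderate)
print(light)
```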

5.) Can I find turbulence reports given a specific flight height?

(Add a second filter to the TB filter that matches a flight level in the report. Using the USA map from earlier, plot the lat/long of reports indicating turbulence at 3,000 ft.)

# Keep only the reports that mention both flight level FL030 and "TB"
Turbatheight <- data.frame(PIREPS, stringsAsFactors = FALSE) %>%
  filter(str_detect(REPORT, "FL030"), str_detect(REPORT, "TB"))
Turbatheight$LON <- as.numeric(as.character(Turbatheight$LON))
## Warning: NAs introduced by coercion
Turbatheight$LAT <- as.numeric(as.character(Turbatheight$LAT))
## Warning: NAs introduced by coercion
ggmap(USA)+ggtitle("Pilot Reports Indicating Turbulence at 3,000 ft")+
  geom_point(data = Turbatheight, aes(x = LON, y = LAT), color= "black", alpha = 1, size = 1)
## Warning: Removed 629 rows containing missing values (geom_point).

Turbulence at differing flight levels is an interesting topic within the aviation industry. Different types of turbulence (shear, vorticity, etc.) occur at different levels throughout the atmosphere. With this, I can find reports of turbulence at a specific flight level and a specific location. This allows me to investigate what type of turbulence was occurring at that flight level (with the help of radar and satellite).
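
Rather than searching for one hard-coded level such as FL030, the flight level can be pulled out of each report as a number, which makes it possible to select any altitude or range of altitudes. A sketch on made-up reports, assuming the level appears as "/FL" followed by three digits giving hundreds of feet. The variable names are hypothetical, and reports without a numeric level come out as NA.

```r
# Hypothetical reports; the third has no numeric flight level
reports <- c(
  "UA /OV ATL/TM 1200/FL030/TP C172/TB LGT",
  "UA /OV ORD/TM 1210/FL350/TP B737/TB MOD",
  "UA /OV DEN/TM 1215/FLUNKN/TP B757/TB SEV"
)

# Extract the three digits after "/FL"; NA where the pattern is absent
has_level <- grepl("/FL[0-9]{3}", reports)
level <- ifelse(has_level,
                sub(".*/FL([0-9]{3}).*", "\\1", reports),
                NA_character_)

# Flight levels are in hundreds of feet
altitude_ft <- as.numeric(level) * 100

# Turbulence reports at or below 10,000 ft
low_turb <- reports[!is.na(altitude_ft) & altitude_ft <= 10000 &
                      grepl("/TB", reports, fixed = TRUE)]
print(altitude_ft)
print(low_turb)
```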

This would be useful in research, as I could select variables and display data quickly. I would be able to look for hotspots in turbulence and in reporting, as well as look for turbulence outbreaks during a specific time period and at a specific flight height.
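
For the time-period part, the VALID column would first have to be turned into real timestamps. The sketch below assumes VALID holds numbers in YYYYMMDDHHMM form in UTC. That is a guess based on the file name and the col_double() type, not something confirmed by the data, and the values here are made up.

```r
# Hypothetical VALID values, assumed to be YYYYMMDDHHMM stored as numbers
valid <- c(201701010000, 201701151230, 201702010600)

# Format without scientific notation, then parse as UTC timestamps
times <- as.POSIXct(sprintf("%.0f", valid), format = "%Y%m%d%H%M", tz = "UTC")

# Select the reports that fall within January 2017
start <- as.POSIXct("2017-01-01", tz = "UTC")
end <- as.POSIXct("2017-02-01", tz = "UTC")
in_january <- times >= start & times < end

print(times)
print(in_january)
```

The same logical vector could then be combined with the turbulence and flight-level filters to narrow an outbreak down in both time and height.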